 size matter


'Meet hot, single firemen, score a prize': Newest way women are finding their love matches

FOX News

In the year 2024, plenty of people are tired of swiping away in an effort to find a love match. Amid all the dating app fatigue, some people are going back to basics by getting out of the house and socializing to find a potential life partner. Single and The City, an events-based company, is helping match people looking for a specific type of person, no matter what type of person that might be.


Generalization in Decision Trees and DNF: Does Size Matter?

Neural Information Processing Systems

Recent theoretical results for pattern classification with thresholded real-valued functions (such as support vector machines, sigmoid networks, and boosting) give bounds on misclassification probability that do not depend on the size of the classifier, and hence can be considerably smaller than the bounds that follow from the VC theory. In this paper, we show that these techniques can be more widely applied, by representing other boolean functions as two-layer neural networks (thresholded convex combinations of boolean functions). For example, we show that with high probability any decision tree of depth no more than $d$ that is consistent with $m$ training examples has misclassification probability no more than $O\left(\left(\frac{1}{m}\left(N_{\mathrm{eff}}\,\mathrm{VCdim}(\mathcal{U})\,\log^2 m\,\log d\right)\right)^{1/2}\right)$, where $\mathcal{U}$ is the class of node decision functions, and $N_{\mathrm{eff}} \le N$ can be thought of as the effective number of leaves (it becomes small as the distribution on the leaves induced by the training data gets far from uniform). This bound is qualitatively different from the VC bound and can be considerably smaller. We use the same technique to give similar results for DNF formulae.


Size Matters: Metric Visual Search Constraints from Monocular Metadata

Neural Information Processing Systems

Metric constraints are known to be highly discriminative for many objects, but if training is limited to data captured from a particular 3-D sensor, the quantity of training data may be severely limited. In this paper, we show how a crucial aspect of 3-D information, namely the absolute size of objects and features, can be added to models learned from commonly available online imagery, without use of any 3-D sensing or reconstruction at training time. Such models can be utilized at test time together with explicit 3-D sensing to perform robust search. Our model uses a "2.1D" local feature, which combines traditional appearance gradient statistics with an estimate of average absolute depth within the local window. We show how category size information can be obtained from online images by exploiting relatively ubiquitous metadata fields specifying camera intrinsics.
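The abstract does not spell out how absolute size is derived from metadata. As a rough illustration only, the sketch below assumes a simple pinhole camera model: a focal length in pixels is computed from intrinsics, and an extent in pixels at a known depth is converted to metres. The function names and all numeric values are hypothetical and are not taken from the paper.

```python
def focal_length_px(focal_mm: float, sensor_width_mm: float, image_width_px: int) -> float:
    """Convert a focal length in millimetres to pixels (pinhole model assumption)."""
    return focal_mm * image_width_px / sensor_width_mm


def metric_size_m(extent_px: float, depth_m: float, focal_px: float) -> float:
    """Absolute size of something spanning extent_px pixels at depth_m metres."""
    return extent_px * depth_m / focal_px


# Hypothetical metadata values, chosen only for illustration.
f_px = focal_length_px(focal_mm=5.0, sensor_width_mm=5.0, image_width_px=1000)
size = metric_size_m(extent_px=200, depth_m=2.0, focal_px=f_px)
print(f"focal length: {f_px} px, estimated size: {size} m")
```

Under this assumption, a feature that looks the same in two images can still be told apart by its metric size once depth is available at test time.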


Machine Learning for Forecasting: Size Matters

#artificialintelligence

Machine learning has been increasingly applied to solve forecasting problems. Classical forecasting approaches, such as ARIMA or exponential smoothing, are being replaced by machine learning regression algorithms, such as XGBoost, Gaussian processes or deep learning. However, despite the increasing attention, there are still doubts about the forecasting performance of machine learning methods. Makridakis, one of the most prominent names in the forecasting literature, has recently presented evidence that classical methods systematically outperform machine learning approaches for univariate time series forecasting [1]. This includes algorithms such as the LSTM, multi-layer perceptron or Gaussian processes.
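For readers unfamiliar with the classical baselines mentioned here, the following is a minimal sketch of simple exponential smoothing, one of the simplest univariate methods. It is not the setup used in the cited comparison; the function name, the toy series and the smoothing factor are assumptions made for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast from simple exponential smoothing.

    The level starts at the first observation and is updated as
    level = alpha * y + (1 - alpha) * level for each later observation.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


# Toy data, purely illustrative.
history = [10.0, 12.0, 14.0]
forecast = simple_exponential_smoothing(history, alpha=0.5)
print(forecast)
```

The single parameter `alpha` controls how quickly older observations are discounted, which is part of why such methods remain competitive on short univariate series.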


📐 Size Matters

#artificialintelligence

The recent emergence of pre-trained language models and transformer architectures has pushed the creation of ever larger machine learning models. Google's BERT showcased the possibilities of the attention mechanism and the transformer architecture as the "next big thing" in ML, and the numbers seem surreal. OpenAI's GPT-2 set a record with 1.5 billion parameters, followed by Microsoft's Turing-NLG with 17 billion parameters, only for the new GPT-3 to reach an astonishing 175 billion parameters. As if that were not enough, just this week Microsoft announced a new release of its DeepSpeed framework (which powers Turing-NLG), which can train a model with up to a trillion parameters. That sounds insane but it really isn't.


How Big Is The Map In 'Assassin's Creed: Odyssey'? Does Size Matter?

Forbes - Tech

The map of Damascus in the first Assassin's Creed was 0.13km². Since then, the map in each successive game has been bigger than the one before. Based on that history, we can expect the map for the upcoming AC: Odyssey to be bigger than last year's brilliant AC: Origins. Dimitras Galatas put together a short video comparing maps from several of the Assassin's Creed games. The map of Greece for Odyssey is shown at 130km², a substantial increase over the tiny map of Damascus and roughly 1.6 times as large as AC: Origins' 80km² map of Egypt.


Size Matters: Cardinality-Constrained Clustering and Outlier Detection via Conic Optimization

arXiv.org Machine Learning

Plain vanilla K-means clustering is prone to produce unbalanced clusters and suffers from outlier sensitivity. To mitigate both shortcomings, we formulate a joint outlier detection and clustering problem, which assigns a prescribed number of datapoints to an auxiliary outlier cluster and performs cardinality-constrained K-means clustering on the residual dataset. We cast this problem as a mixed-integer linear program (MILP) that admits tractable semidefinite and linear programming relaxations. We propose deterministic rounding schemes that transform the relaxed solutions to feasible solutions for the MILP. We also prove that these solutions are optimal in the MILP if a cluster separation condition holds.
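To make the problem statement concrete, the sketch below solves a tiny instance by exhaustive enumeration: every way of splitting the points into clusters of prescribed sizes plus a fixed number of outliers is scored by within-cluster sum of squared errors. This is not the paper's MILP or its conic relaxations; it only illustrates the objective. It assumes one-dimensional points and positive cluster sizes, and the names `sse` and `best_partition` are hypothetical.

```python
from itertools import combinations


def sse(cluster):
    """Sum of squared deviations from the cluster mean (1-D points)."""
    mean = sum(cluster) / len(cluster)
    return sum((x - mean) ** 2 for x in cluster)


def best_partition(points, sizes, n_outliers):
    """Brute-force cardinality-constrained clustering with outliers.

    Returns (cost, clusters, outliers). Only feasible for very small inputs.
    """
    if any(s <= 0 for s in sizes):
        raise ValueError("cluster sizes must be positive")
    if sum(sizes) + n_outliers != len(points):
        raise ValueError("sizes and n_outliers must account for every point")
    best = None

    def recurse(remaining, sizes_left, clusters):
        nonlocal best
        if not sizes_left:
            # All clusters are filled; whatever remains is the outlier set.
            cost = sum(sse([points[i] for i in c]) for c in clusters)
            if best is None or cost < best[0]:
                best = (
                    cost,
                    [[points[i] for i in c] for c in clusters],
                    [points[i] for i in remaining],
                )
            return
        for chosen in combinations(remaining, sizes_left[0]):
            rest = [i for i in remaining if i not in chosen]
            recurse(rest, sizes_left[1:], clusters + [chosen])

    recurse(list(range(len(points))), list(sizes), [])
    return best


# Toy data, purely illustrative: two tight groups and one far-away point.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0, 50.0]
cost, clusters, outliers = best_partition(points, sizes=(3, 3), n_outliers=1)
print(cost, clusters, outliers)
```

The number of candidate partitions grows combinatorially with the dataset size, which is the motivation for tractable relaxations and rounding schemes of the kind the abstract describes.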


Big Data: Does Size Matter? Book Review

#artificialintelligence

Data has been the focal point of technology, machine-learning and artificial intelligence for decades now. While technology has evolved, so has the need for data. As per estimates from American multinational corporation Computer Sciences, we are expected to consume 44 times more data in 2020 than we did a decade ago. This is also evident from the fact that the vocabulary for data has changed over the years. While we would recognise terms like 'gigabyte' and 'terabyte', the millennials are now talking about 'petabytes' and 'zettabytes'.


When size matters: selection of training sets for support vector machines Future Processing

#artificialintelligence

The amount of data produced every day grows tremendously in most real-life domains, including medical imaging, genomics, text categorisation, computational biology, and many others. Although this appears beneficial at first glance (more data could mean more possibilities of extracting and revealing useful underlying knowledge), handling massively large datasets has become a challenging issue and attracts research attention, especially in the era of big data. This big data revolution has affected many research fields, including statistics, machine learning, parallel computing, and computer systems in general [1]. Storing and analysing the acquired historical information should allow predicting the label of an incoming (unseen) feature vector, containing some quantified features of a given data example. If the labels are categorical, we are tackling a classification task (otherwise it is regression).